Search for: All records

Creators/Authors contains: "Tang, Yue"


  1. FPGA-based edge servers are used in many applications in smart cities, hospitals, retail, etc. Equipped with heterogeneous FPGA-based accelerator cards, the servers can run multiple tasks, including efficient video preprocessing, machine learning algorithm acceleration, etc. These servers are required to perform inference during the daytime and re-train the model at night to adapt to new environments, domains, or users. Conventionally, during re-training the incoming data are transmitted to the cloud, and the updated machine learning models are then transferred back to the edge server. Such a process is inefficient and cannot protect users’ privacy, so it is desirable for the models to be trained directly on the edge servers. Deploying convolutional neural network (CNN) training on heterogeneous resource-constrained FPGAs is challenging since it needs to consider both the complex data dependency of the training process and the communication bottleneck among different FPGAs. Previous multi-accelerator training algorithms select optimal scheduling strategies for data parallelism, tensor parallelism, and pipeline parallelism. However, pipeline parallelism cannot deal with batch normalization (BN), which is an essential CNN operator, while purely applying data parallelism and tensor parallelism suffers from resource under-utilization and intensive communication costs. In this work, we propose MTrain, a novel multi-accelerator training scheduling strategy that transforms the training process into a multi-branch workflow, so that independent sub-operations of different branches are executed on different training accelerators in parallel for better utilization and reduced communication overhead. Experimental results show that we can achieve efficient CNN training on heterogeneous FPGA-based edge servers with 1.07x-2.21x speedup under 15 GB/s peer-to-peer bandwidth compared to the state-of-the-art work.
    Free, publicly-accessible full text available January 1, 2026
  2. Free, publicly-accessible full text available November 1, 2025
  3. DNNs are rapidly evolving from streamlined single-modality single-task (SMST) to multi-modality multi-task (MMMT) with large variations for different layers and complex data dependencies among layers. To support such models, hardware systems have also evolved to be heterogeneous. The heterogeneous system comes from the prevailing trend to integrate diverse accelerators into the system for lower latency. FPGAs have high computation density and communication bandwidth and are configurable to be deployed with different designs of accelerators, which are widely used for various machine-learning applications. However, scaling from SMST to MMMT on heterogeneous FPGAs is challenging since MMMT has much larger layer variations, a massive number of layers, and complex data dependency among different backbones. Previous mapping algorithms are either inefficient or over-simplified, which makes them impractical in general scenarios. In this work, we propose CHEF to enable efficient implementation of MMMT models in realistic heterogeneous FPGA clusters, i.e., deploying heterogeneous accelerators on heterogeneous FPGAs (A2F) and mapping the heterogeneous DNNs on the deployed heterogeneous accelerators (M2A). We propose CHEF-A2F, a two-stage accelerators-to-FPGAs deployment approach to co-optimize hardware deployment and accelerator mapping. In addition, we propose CHEF-M2A, which can support general and practical cases compared to previous mapping algorithms. To the best of our knowledge, this is the first attempt to implement MMMT models in real heterogeneous FPGA clusters. Experimental results show that the latency obtained with CHEF is near-optimal while the search time is 10000X less than exhaustively searching for the optimal solution.
  4. The existence of a quantum critical point (QCP) and fluctuations around it are believed to be important for understanding the phase diagram in unconventional superconductors such as cuprates, iron pnictides, and heavy fermion superconductors. However, the QCP is usually buried deep within the superconducting dome and is difficult to investigate. The connection between quantum critical fluctuations and superconductivity remains an outstanding problem in condensed matter. Here combining both electrical transport and Nernst experiments, we explicitly demonstrate the onset of superconductivity at an unconventional QCP in gate-tuned monolayer tungsten ditelluride (WTe2), with features incompatible with the conventional Bardeen-Cooper-Schrieffer scenario. The results lead to a superconducting phase diagram that is distinguished from other known superconductors. Two distinct gate-tuned quantum phase transitions are observed at the ends of the superconducting dome. We find that quantum fluctuations around the QCP of the underdoped regime are essential for understanding how the monolayer superconductivity is established. The unconventional phase diagram we report here illustrates a previously unknown relation between superconductivity and QCP. Published by the American Physical Society 2025
    Free, publicly-accessible full text available February 1, 2026
  5. Introducing superconductivity in topological materials can lead to innovative electronic phases and device functionalities. Here, we present a unique strategy for quantum engineering of superconducting junctions in moiré materials through direct, on-chip, and fully encapsulated 2D crystal growth. We achieve robust and designable superconductivity in Pd-metalized twisted bilayer molybdenum ditelluride (MoTe2) and observe anomalous superconducting effects in high-quality junctions across ~20 moiré cells. Unexpectedly, the junction develops enhanced, instead of weakened, superconducting behaviors, exhibiting fluctuations to a higher critical magnetic field compared to its adjacent Pd7MoTe2 superconductor. In addition, the critical current further exhibits a notable V-shaped minimum at zero magnetic field. These features are unexpected in conventional Josephson junctions and absent in junctions of natural bilayer MoTe2 created using the same approach. We discuss implications of these observations, including the possible formation of mixed even- and odd-parity superconductivity at the moiré junctions. Our results also demonstrate a pathway to engineer and investigate superconductivity in fractional Chern insulators.
    Free, publicly-accessible full text available January 31, 2026
  6. Two-dimensional (2D) transition metal dichalcogenides (TMDs) are a versatile class of quantum materials of interest to various fields including, e.g., nanoelectronics, optical devices, and topological and correlated quantum matter. Tailoring the electronic properties of TMDs is essential to their applications in many directions. Here, we report that a highly controllable and uniform on-chip 2D metallization process converts a class of atomically thin TMDs into robust superconductors, a property belonging to none of the starting materials. As examples, we demonstrate the introduction of superconductivity into a class of 2D air-sensitive topological TMDs, including monolayers of Td-WTe2, 1T-MoTe2, and 2H-MoTe2, as well as their natural and twisted bilayers, metallized with an ultrathin layer of palladium. This class of TMDs is known to exhibit intriguing topological phases ranging from topological insulator and Weyl semimetal to fractional Chern insulator. The unique, high-quality two-dimensional metallization process is based on our recent findings of the long-distance, non-Fickian in-plane mass transport and chemistry in 2D that occur at relatively low temperatures and in devices fully encapsulated with inert insulating layers. Highly compatible with existing nanofabrication techniques for van der Waals stacks, our results offer a route to designing and engineering superconductivity and topological phases in a class of correlated 2D materials. Published by the American Physical Society 2024
  7. Optical spectroscopy of quantum materials at ultralow temperatures is rarely explored, yet it may provide critical characterizations of quantum phases not possible using other approaches. We describe the development of a novel experimental platform that enables optical spectroscopic studies, together with standard electronic transport, of materials at millikelvin temperatures inside a dilution refrigerator. The instrument is capable of measuring both bulk crystals and micrometer-sized two-dimensional van der Waals materials and devices. We demonstrate its performance by implementing photocurrent-based Fourier transform infrared spectroscopy on a monolayer WTe2 device and a multilayer 1T-TaS2 crystal, with a spectral range available from the near-infrared to the terahertz regime and in magnetic fields up to 5 T. In the far-infrared regime, we achieve spectroscopic measurements at a base temperature as low as ∼43 mK and a sample electron temperature of ∼450 mK. Possible experiments and potential future upgrades of this versatile instrumental platform are envisioned. 